Nice latent variable models have log-rank
Abstract
Low rank matrices are pervasive in big data, appearing in recommender systems, movie preferences, topic models, medical records, and genomics. While there is a vast literature on how to exploit low rank structure in these datasets, less attention has been paid to explaining why the low rank structure appears in the first place. We explain the abundance of low rank matrices in big data by proving that certain latent variable models associated to piecewise analytic functions are of log-rank: a large matrix from such a latent variable model can be approximated, up to a small error, by a matrix whose rank grows only logarithmically with its dimensions.
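The claim in the abstract can be probed numerically with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the analytic function f(a, b) = exp(a * b), scalar latent variables drawn uniformly from [-1, 1], the function names `latent_variable_matrix` and `rank_for_entrywise_error`, and the tolerance 1e-6. The truncated SVD is used only as a convenient low rank approximation; it is not necessarily the construction used in the proof.

```python
import numpy as np


def latent_variable_matrix(m, n, seed=0):
    """Build X[i, j] = f(a_i, b_j) for scalar latent variables a_i, b_j in [-1, 1].

    f(a, b) = exp(a * b) is an illustrative analytic function (an assumption),
    not one taken from the paper.
    """
    rng = np.random.default_rng(seed)
    a = rng.uniform(-1.0, 1.0, size=m)  # one latent variable per row
    b = rng.uniform(-1.0, 1.0, size=n)  # one latent variable per column
    return np.exp(np.outer(a, b))


def rank_for_entrywise_error(X, eps):
    """Return the first r whose rank-r truncated SVD is within eps of X entrywise."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    for r in range(1, len(s) + 1):
        X_r = (U[:, :r] * s[:r]) @ Vt[:r, :]  # rank-r reconstruction
        if np.max(np.abs(X - X_r)) <= eps:
            return r
    return len(s)


if __name__ == "__main__":
    # Print the rank needed for a fixed entrywise tolerance as the matrix grows.
    for size in (50, 100, 200, 400):
        X = latent_variable_matrix(size, size)
        print(size, rank_for_entrywise_error(X, eps=1e-6))
```

Under these assumptions, the rank needed to meet the tolerance should stay small and grow far more slowly than the matrix size, which is the behavior the abstract describes.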
Similar resources
Sandwiching the marginal likelihood using bidirectional Monte Carlo
Computing the marginal likelihood (ML) of a model requires marginalizing out all of the parameters and latent variables, a difficult high-dimensional summation or integration problem. To make matters worse, it is often hard to measure the accuracy of one’s ML estimates. We present bidirectional Monte Carlo, a technique for obtaining accurate log-ML estimates on data simulated from a model. This...
Bayesian Learning for Low-Rank matrix reconstruction
We develop latent variable models for Bayesian learning based low-rank matrix completion and reconstruction from linear measurements. For under-determined systems, the developed methods are shown to reconstruct low-rank matrices when neither the rank nor the noise power is known a priori. We derive relations between the latent variable models and several low-rank promoting penalty functions. Th...
Determinants of Inflation in Selected Countries
This paper focuses on developing models to study influential factors on the inflation rate for a panel of available countries in the World Bank database during 2008-2012. For this purpose, random effect log-linear and ordinal logistic models are used for the analysis of continuous and categorical inflation rate variables. As the original inflation rate response to variables shows an appar...
NICE: Non-linear Independent Components Estimation
We propose a deep learning framework for modeling complex high-dimensional densities via Nonlinear Independent Component Estimation (NICE). It is based on the idea that a good representation is one in which the data has a distribution that is easy to model. For this purpose, a non-linear deterministic transformation of the data is learned that maps it to a latent space so as to make the transfo...
Identification of discrete concentration graph models with one hidden binary variable
Conditions are presented for different types of identifiability of discrete variable models generated over an undirected graph in which one node represents a binary hidden variable. These models can be seen as extensions of the latent class model to allow for conditional associations between the observable random variables. Since local identification corresponds to full rank of the parametrizat...
Journal:
- CoRR
Volume: abs/1705.07474
Pages: -
Publication date: 2017